Battling Steganography
An anonymous reader submitted a fairly thin little story about a researcher who is Battling Steganography. I can certainly see the appeal of the study but it really seems like a needle-in-a-haystack sort of project. And when you actually can detect one technique, new and better techniques will crop up and take its place.
The guy in the photograph has no eyes! Maybe he stared at his monitor a little bit too long. In fact, is it just me or does he look like a cardboard cutout? Very suspicious. I bet there's a hidden message embedded in that picture!
Last I heard, the FBI doesn't go around busting people for passing around what might be secret messages. I know there's been complaining about a general erosion of rights and privacy in the US, but I doubt it's gotten that bad.
??? I think you may not understand what ironic means, or what the poster means, or what AC is. (1) AC is a means to protect privacy. (2) The AC poster is against people trying to infringe on privacy. (3) Ironic applies when two things are unexpectedly opposed, not when they are exactly what you'd expect.
... I wanna start *using* steganography! Won't some enterprising Karma Whore throw us a couple links?
Was it just me, or did the article make it seem like anyone that would use steganography would be a criminal? Since when in a 'free' country should the ability to hide a message be of interest to the "legal community"?
It's hard to tell the cool to chill, my favorite hotel room has a view to an ill.
So what if he can predict the likelihood of a hidden message? He still can't decipher it.
Steganography as a whole seems unnecessary and overly complex. I couldn't care less if you can see my message. You'll have a hell of a time reading it, thanks to encryption.
rieu ro,SZE98U=[GMLC #$%*UJHNMPO(I&%$sdfghjkl
Understand what I'm saying?
Let's say I wanted a message to be available to a wide number of people, hidden with steganography, and encoded as well. I pick an image, such as an X10 ad, that could be easily found from a "legit" source. I encode my message, then hide the encoded message in the least significant bit of the color for each pixel of the image. Net effect: the ad looks just about the same, but there is data encoded in it.
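For anyone who wants to see the least-significant-bit trick spelled out, here is a minimal Python sketch. It assumes the image is already available as a flat run of 8-bit color values; `embed_lsb`, `extract_lsb`, `carrier` and `secret` are names made up for illustration, and a real tool would also have to deal with file formats and compression.

```python
def embed_lsb(pixels: bytes, message: bytes) -> bytes:
    """Hide the bits of message in the least significant bit of each carrier byte."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("message too long for this carrier")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # clear the LSB, then set it to the message bit
    return bytes(out)


def extract_lsb(pixels: bytes, n_bytes: int) -> bytes:
    """Read n_bytes back out of the LSBs, most significant bit first."""
    out = bytearray()
    for i in range(n_bytes):
        value = 0
        for b in pixels[i * 8:(i + 1) * 8]:
            value = (value << 1) | (b & 1)
        out.append(value)
    return bytes(out)


carrier = bytes(range(256)) * 4   # hypothetical stand-in for raw pixel data
secret = b"meet at noon"          # in practice this would already be encrypted
stego = embed_lsb(carrier, secret)
recovered = extract_lsb(stego, len(secret))
```

No carrier byte moves by more than one step, which is why the picture looks just about the same to the eye.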
If I knew messages were being passed this way, I might be able to get the message. First, I'd have to acquire the source image. Then, I would do my own diffs, and try to find the meaningful data. At that point, it's a decryption problem.
But how do I detect the data hiding in the first place? I would have to detect that a stream of data is very similar to another stream of data, but with minor differences.
Let's say I've solved that problem, and now have some signature, such that all identical data streams have the same signature, and very close streams have very close signatures. Then, I have to catalog data streams as they pass by, assign signatures, count instances of signatures, and call a hit when signatures are significantly close but not the same. A quick visual check can confirm the match.
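A signature with that "close inputs give close signatures" property is roughly what locality-sensitive hashing aims for. As a rough sketch only, here is a SimHash-style fingerprint over words; it is a stand-in chosen for illustration, not something the article or any existing service is known to use, and `simhash` and `distance` are hypothetical names.

```python
import hashlib


def simhash(text: str) -> int:
    """64-bit fingerprint: each word votes on every bit, the majority wins."""
    votes = [0] * 64
    for word in text.lower().split():
        digest = hashlib.md5(word.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")  # use the first 64 bits of the hash
        for i in range(64):
            votes[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(64) if votes[i] > 0)


def distance(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


sig_a = simhash("buy cheap pills now")
sig_b = simhash("now pills cheap buy")
sig_c = simhash("buy cheap pills today")
```

Identical text (or the same words reordered) gives distance 0. Text that shares most of its words tends to land within a few bits, though that is a statistical tendency rather than a guarantee.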
Back to my original thought - instead of a data stream representing an image, what if the data stream represented the subject line of an e-mail, or the e-mail itself? A central database could manage signatures, automatically reported by e-mail clients that generate the signatures. When I get a new e-mail, I can get the signature for the header, and send it to the database. It could then report "that might be spam", and I could delete it without downloading the whole message. I could also download the message, upload the signature, and the database could say "that's probably spam", and it could be deleted or moved before it shows up in my Inbox. With many people uploading signatures, the database could quickly generate the average signature and the variance of the signatures, with people double-checking "Yes, I consider this to be spam".
A couple of benefits would be that, hopefully, the signature doesn't give much info about the text, so it would be safe to upload signatures for personal email. Also, it may be fairly easy to get enough responses to be statistically certain that email with a particular signature is spam, so that many would benefit from a randomly chosen few who choose to respond that an email is spam.
Of course, it may be impossible to generate that signature, or the signature may be long enough to identify the text of messages. Still, I could see that as a benefit of this kind of research. I'd also like a way to auto-respond "You have been found guilty of forwarding hoax emails. Please stop and desist." to just about everything my family sends me...
1.) Usenet/slashdot post seems oddly coherent.
2.) There's that certain, special _sumthin'_ about the fractal glint in Asia's lower lip....
3.) Lip-reader of your acquaintance says, "While he's doing that to Anthony Perkins in that doctored photo, Gore appears to be saying, 'Al, your Bates are belong to U.S.'"
4.) Snowcrashing.
Excuse me ? Did I wander into The Onion by mistake ?
Thanks for the compliment. It was a lot harder than I'd expected it to be.
Certainly there are tools out there that put together random, sensical-looking text with specific patterns in word usage, punctuation, spacing, whatever, to encode messages, but to actually tweak a message with intrinsic meaning in itself is a bit more difficult.....
I like the way he claims a 90% success rate. Either the researcher is a moron or else the person writing the article has already beaten him there.
What if there were three encrypted messages in each image he processed? Finding one is useless, because the sender could put an easy message in and two extra that won't get caught.
Better yet: his algorithm could be giving him garbage hits and not be finding anything real. The pictures could be just pictures. Novel concept.
*whew* Moron alert - eleventy three o'clock.
A large number of people in this discussion are entirely missing the point of what Farid does.
Let's put it this way. If Farid alone can crack a variety of steganography, then so can the NSA or whoever it is who really wants to invade your privacy. If he was trying to crack RSA or DES or PDF's ROT13 encryption, he would be praised - do you really think that steganography is somehow special?
So the article was rather uninformative. I've met Farid. He's a very cool guy. He's working against things like SDMI - which is a form of steganography. As part of a lecture he gave, he showed how to defeat various watermarking techniques for images (without getting arrested, even.)
Consider that when you say "battling steganography is battling privacy! We must hate him!" you are using the same logic that put the DMCA in place. Congratulations.
Win dain a lotica, en vai tu ri silota
Actually, the forum this story was extracted from is pretty much geared towards only generating PR, and not scientific exchange. Attacking Farid and/or Dartmouth for this is silly... this is how institutions generate attention and money for grants.
But it is especially silly since he does such a bangup job of putting his technical work on-line:
Farid's Publications
Well, Sklyarov is currently in jail for breaking ROT13.
Hey, remember the site on the net that had the lyrics to many songs that got shut down? Embedding the lyrics to Penny Lane is illegal.
Robert
That lack of certainty really isn't that big an issue, because with a good idea of what percentage of images are false positives it would be fairly simple to look for image sources where the percentage was well outside the norm.
All of this would of course be very resource intensive and would require access to large amounts of data (Omnivore, anyone?) but it's far from outside the capabilities of most governments.
Possibly also of interest to people is Benford's Law, which relates to the distribution of numbers - turns out that in many areas it's very simple to identify real data vs random data, because real data has some definite non-random properties.
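For the curious, Benford's Law says the leading digit d shows up with probability log10(1 + 1/d). A small sketch of checking data against it follows; the powers of two are just a convenient example data set, and the function names are made up here.

```python
import math
from collections import Counter


def benford_expected():
    """Expected frequency of each leading digit 1..9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit_freq(values):
    """Observed leading-digit frequencies for a list of positive integers."""
    digits = [int(str(v)[0]) for v in values if v > 0]
    counts = Counter(digits)
    return {d: counts[d] / len(digits) for d in range(1, 10)}


expected = benford_expected()
observed = leading_digit_freq([2 ** n for n in range(1, 201)])
```

Uniformly random numbers would put each leading digit near 1/9, while the Benford curve puts "1" at about 30%, so made-up figures often stand out.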
fencepost
just a little off
Anyone wanna bet that Farid is the AC who submitted the story? I took a programming class from him a few years ago...he seemed pretty full of himself then, too.
Good steganography is essentially the same as adding random noise to an image. You can structure the noise any way you like. There are lots of images that plausibly contain lots of noise, for example images taken in low light and images scanned from film. As long as you don't insist on a very efficient steganographic embedding, there are undetectable steganographic methods. Farid's research is pointless, and it is scary to think that courts may start relying on it.
Hiding the message would be a form of encryption protected by that dastardly DMCA....want my hidden message? I'll sue you!
Try the following: 1. Go to the Dartmouth home page, 2. Search for Farid, 3. Click on Farid's link, 4. Click on the address for his home page. Obviously this 404 error page must have a hidden message. Results of wavelet compression analysis will be posted later.
I think yr stats are way too low -- or are humans not animals?
Sounds like this guy has managed to get a grant to look at p0rn. "No, I'm not looking at p0rn, I'm looking for hidden messages." :)
Keep Austin Weird!
funny sig ;)
So what? The very fact that the attacker has to look at TCP window sizes at all, or other ephemeral data, such as ping packet contents, is part of an overall strategy of making steganographic communications too expensive to isolate. It's an arms race; in this case, the sheer number of different possible techniques means that the information hider has the advantage, because the searcher has to look in so many places!
well, only lossy compression chucks data. if you used gzip to compress an executable file then it had better come out the other end looking identical or someone will be annoyed. now if someone mp3'd that audio then fair enough. but you shouldn't generalise that all audio compression is lossy.
dave
If we're talking about digital cameras as carrier sources then it is not strictly true that the LSBs in an uncompressed ( that's where you should start ) image are truly random noise.
All but the most expensive dcams use some sort of mosaic filter over a monochrome sensor. The most common pattern is Bayer :
rgrgrgrg
gbgbgbgb
rgrgrgrg
gbgbgbgb
To get an RGB image the missing data are formed from neighboring cells. Various filters are used. The result is that there will be correlations with neighbors.
The generalized implication for steganography development is that the better the source characteristics are understood and quantified, the better the technique that can be designed for using that source as a carrier.
It looks like it's time to kick it up a notch as far as stego goes. BTW the FBI has asked for and received funding for research into hidden data detection. Maybe this is one of their first academic grants.
m ( sort of anonymous but not a coward )
TCP windows are probably a bad place. They tend to follow very well defined behavior, and often only change in direct response to other packets in the stream. For example, during a one-way bulk data transfer, the senders window will rarely change at all. The receivers window will usually change only by the amount of data received in a packet. All very very predictable.
Suppose one gets caught with such an image. According to him, the technique has a 90% chance of success. So what about the 10%, wherein one has no message encoded in an image, but triggers the alarms anyway?
The 10% miss rate in and of itself should still represent plausible deniability. If you take standard legal practices, a 90% probability of a "match" is still weak enough that it would require other supporting evidence, circumstantial or otherwise, to present a reasonable case.
If you get caught by the FBI, what can you say?
Caught how? It's not illegal to embed hidden messages in images, just as it's not illegal to hide a plot in pornography - though both are equally unlikely.
I just hope more evidence than this is required for a warrant to be issued.
IANAL, but a 90% probability that you're engaging in a perfectly legal activity doesn't seem, on its face, to meet the burden of probable cause necessary to perform a legal search and seizure.
My car gets 40 rods to the hogshead, and that's the way I likes it!
No, this would actually be really cool. Make an Apache module which automatically inserts something steganographically into every JPG it serves. Some people put encrypted data into the images, and others just direct it to read from randomly encrypted gibberish. Then the government has to deal with lots of script kiddies who think they are cool by embedding Brittney Spears mp3s into the images from their webpages.
So, how can the algorithms mentioned in the article (which is interesting, but rather short on facts...) distinguish between the noise added by a steganographically embedded encrypted message and the noise caused by a slightly underspecced A to D converter?
You're right, there isn't too much of a difference between random noise and an encrypted communication. If you had a pure digital stream that had just been converted from analog, you could stick data in the least significant bits and no one would be the wiser. For example, a CD is just a sequence of 16 bit words iterated 44,100 times a second; you could just replace the least significant bit in each word with bits from your hidden message and it would be indistinguishable from random noise.
The problem arises when you try to compress digital information. These compression algorithms use the most efficient way to represent data that they can find and discard the least significant data, so they would completely destroy the aforementioned hidden message. To hide data in a compressed file you need to play with how the compression mechanism stores the data, and the resulting file is most probably not going to be optimally compressed when you're done. What this guy is doing is looking at how the information was compressed, extracting the overlying data that was being stored, and making sure the compression algorithm was indeed optimal. If there are any odd quirks in the compressed data or it doesn't look like the compression was optimal, it may be because data is hidden inside.
I hope this is a good enough explanation. I'm short on the examples but the underlying ideas are pretty basic.
~~~
You might also want to check the techreports that I published about my research.
At HAL 2001, I presented on Detecting Steganographic Content on the Internet. You might like that.
Dartmouth certainly seems to know how to do PR. I would just like to know where their publications are.
You might say that 90% is pretty significant. But considering how many images there are out there with no steganographic message at all, I think you'll actually end up persecuting more innocent people.
I just hope more evidence than this is required for a warrant to be issued.
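To put rough numbers on that worry, here is a back-of-the-envelope Bayes calculation. The 90% hit rate is the article's figure and the 10% false-alarm rate is the reading used elsewhere in this thread; the prevalence of 1 stego image in 10,000 is purely an assumption picked for illustration, and `posterior` is a made-up name.

```python
def posterior(prevalence: float, hit_rate: float, false_alarm_rate: float) -> float:
    """Chance that a flagged image really carries a message (Bayes' rule)."""
    true_hits = prevalence * hit_rate
    false_hits = (1 - prevalence) * false_alarm_rate
    return true_hits / (true_hits + false_hits)


# Assumed: 1 image in 10,000 carries a message (hypothetical figure).
p_flagged_is_real = posterior(prevalence=1e-4, hit_rate=0.9, false_alarm_rate=0.1)

# For comparison: if half of all images carried messages.
p_if_common = posterior(prevalence=0.5, hit_rate=0.9, false_alarm_rate=0.1)
```

Under those assumptions fewer than 1 in 1,000 flagged images would actually hide anything, so almost every alarm lands on an innocent picture.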
Maybe a subliminal technique (executed right when explicit and weirdly aberrated information triggers your opponent's unencrypting reader) can obscurely manipulate mental awareness, negating data.
The same applies to steganography, IMHO. SOMEONE has to break it - it might as well be me.
grep -ri 'should work'
When a method of steganography is discovered, it is useless.
Yes, if the _method_ itself is discovered, it's useless. However, if each instance of the method's use is quantitatively/qualitatively different enough then the method itself may still be capable of generating additional useful instances even once some are discovered. In other words, if the pattern of uses of a particular method isn't obvious then the method itself remains safe even if some of its output is discovered. Of course, this requires a very sophisticated, dynamic, chaotic, magical method. Or maybe just many methods rolled into one.
Imagine trying to decipher the hidden messages in "The 5000 fingers of Dr. T.". It is a movie and as such contains the symbolism and iconography and messages of many individuals. Some of them are apparent, some of them covert, and some of them downright indecipherable.
Also, think about the Blade Runner/Ridley Scott "Is Deckard a replicant" business that lasted, well, right up until he told the world the answer. It is that sort of interpretation that someone hoping to decipher steganography would have to perfect. It's not just stuff like: Hi Everyone Likes Punch!
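The "Hi Everyone Likes Punch!" kind of message is the easy case, as a tiny sketch shows (`acrostic` is just an illustrative name):

```python
def acrostic(text: str) -> str:
    """Join the first letter of every word that starts with a letter."""
    return "".join(word[0] for word in text.split() if word[0].isalpha())


hidden = acrostic("Hi Everyone Likes Punch!")  # -> "HELP"
```

A machine can run that over any text in no time. The interpretive kind of hidden meaning has no such mechanical rule to apply.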
The only way to get messages out of such texts is intimate knowledge of the author(s) or intended recipients of the hidden meanings. By asking them, or sodium pentothal, or the NSA's computer simulation of everybody's brain.
I'm no cryptographer, but the most reliable and cost effective way to discover a secret is likely to investigate the people that know the secret, rather than try to divine meaning from a text that came into your hands.
I don't need large brains to have a good time.
If steganography can be made "turnkey", it'll work
for most of today's privacy requirements.
You might think that it'd be easy to detect,
or simple to prevent, but that's simply not true.
Unless someone lists all the ways in which one
can hide information, and a fantastically fast
approach to testing any given communication on the
net against those techniques. Otherwise, to
read a steganographically-encoded message,
each recipient will need to figure out which of
all the messages intercepted even includes the
data you're looking for, and what was used in
this particular instance. Hell, one might even
have two or more different techniques applied
in a single message. Like this message does.
Sort of.
....
well.. if the junk is properly steganographed so you can retrieve that junk (although penny lane isn't junk, good song :) then you can use the lyrics and the knowledge of the steganographic technique to restore the picture to its former state, at which point you can run the steganography detector again and get the real secret...
of course it gets more cunning when the data you remove steganographically is itself an image with steganographed data in it, and that data is...
and echelon has a machine to do all this but completely missed your bombing plans which were the subject of the picture itself and not the steganographed data... hiding the wood in the trees as it were.
dave
Apparently the guy is under the impression that steganography is only used by criminals.
Steganography is nothing new. People have been hiding secret messages in innocuous objects since time began. Naturally, various people want to prevent this, but the method's very nature makes it almost impossible to simply track.
Got Rhinos?
How about you actually *read* the article before you post?
I'd assume that they're working with photographs and photograph-like images (as opposed to stick-man drawings or something like that). In that case, the function could look at certain things that would appear in a photograph - colour borders, gradient, etc. If the picture consistently doesn't show what's expected, then that could be used to show that there's been some sort of change made to it. I don't know much about graphics analysis, so I couldn't say for sure, but I can see this working.
Last post!
It seems like Dr. Farid's research would have wide application in detecting and "battling" related technologies like digital watermarking in sounds and images. I wonder if the media companies will try to use the DMCA to bully him out of his research, as we've seen in similar cases.
Most cyphers are pseudo-random to some degree. Nearly all of them will pass various statistical tests for randomness and entropy measurements to some degree. How well they pass is another matter and is something on which one can construct an entropy signature.
In Steganography you want your plaintext to appear statistically as identical as you can to your chaff / image / noise stream. Creating a good match is difficult. Hany Farid, for example, is attempting to use various tests to identify plaintext within an image. With the right tests one should help identify which is noise and which is plaintext.
To combat Farid's method one needs a cryptographically strong PRNG or true random source. Then bias the output in a fashion that is identical to the noise (big handwave here ... this is hard to do well).
Finally mix the plaintext with the biased stream and inject it into the noise in a way that is known to you and the receiver.
I have yet to come across something that could generate as random a number in as closed a space...
The next generation LavaRnd will give you that in a very compact space, using a patent-free algorithm and open source demo software. Final hacking is going on now. Code completion and demos will soon follow. Paper to be published sometime after ...
p.s.: Gotta love moderating. My original article stays a 1 and your reply gets a 2. Both are directly on topic while some joke gets a 4. Maybe moderation scores are a good source of random noise? :-)
chongo (was here)
While it's true that human beings can interpret images to mean something that a machine could never pick up on, that's not the thrust of the research being done here.
He is doing research into a very particular kind of steganography, whereby messages are concealed within an image via slightly altering the least significant bits of an image.
When you encode information in this way, somebody knowing how to extract it can pull out a message which is not subjective (as in the example of interpreted images given by another poster), but rather is very concrete.
There is some evidence that this form of encoding has been used to communicate information throughout terrorist cells.
What the researcher is doing is developing a method to detect when the LSB's in an image have been manipulated slightly. He is not trying to decode the message, but only to flag particular images as being suspicious.
Decoding would be a matter for someone completely different -- like the FBI, for instance.
His method does have applications, and if it is through alteration of LSB that a message is embedded in an image, it will apparently detect such 90% of the time.
This is a vast improvement over any existing methods I know of for detecting LSB manipulation.
So he's not quite looking for a needle in a haystack. He's examining millions of haystacks, and pinpointing the ones that probably *do* have needles in them.
Quite a large difference, really.
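The article doesn't spell out Farid's statistics, so what follows is not his method. It is a sketch of a different, well-known idea for spotting LSB replacement, often called the pairs-of-values chi-square test: overwriting LSBs with message bits tends to even out the counts of each value pair (2k, 2k+1), so pairs that are too evenly balanced look suspicious. The sample data and the name `pair_chi_square` are made up for illustration.

```python
from collections import Counter


def pair_chi_square(samples):
    """Chi-square statistic comparing each (2k, 2k+1) pair against an even split.

    Values near zero mean the pairs are almost perfectly balanced, which is
    what LSB replacement with random-looking bits tends to produce.
    """
    hist = Counter(samples)
    stat = 0.0
    for even in range(0, 256, 2):
        a, b = hist[even], hist[even + 1]
        expected = (a + b) / 2
        if expected > 0:
            stat += (a - expected) ** 2 / expected
    return stat


natural = [10] * 300 + [11] * 100  # lopsided pair, as an untouched image might have
# Alternating bits stand in for the bits of an encrypted message.
stego = [(v & 0xFE) | (i & 1) for i, v in enumerate(natural)]

natural_stat = pair_chi_square(natural)
stego_stat = pair_chi_square(stego)
```

In this toy data the untouched samples score 50.0 and the altered ones 0.0. Real images are far messier, so a practical detector would need a proper significance threshold.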
-l
Thus, in short, proving the absurdity of the DMCA.
Healthcare article at Kuro5hin
I don't see how anyone with a conscience could decide to intentionally try to destroy methods with which people can protect their privacy.
Remember, a good encryption algorithm will render its output indistinguishable from random bits. So, his techniques will work only as long as the data is not encrypted. Once the baddies(?) start encrypting data and putting it in there, he won't be able to detect the presence using statistical techniques.
I know, there's the problem of key distribution. But you could include the key itself as plain text in the first x number of bytes of your payload, followed by the actual data encrypted using DES/AES/TwoFish. Unless the decoder knows the length and the location of the key (something you can decide on beforehand), s/he won't be able to decode it.
> If you take a photo of a TV screen, it comes out black..
That depends on many factors including the speed of the film, aperture on your camera, shutter speed..... It is quite possible to take a good picture of a TV screen, you just have to make sure the shutter speed is long enough to get a whole frame and the aperture is wide enough to expose the film completely.
If you strongly encrypted the data before placing it into a data stream, then you have made the task all that much harder because you now have no sensible data to extract, just seemingly random noise. Just a thought...
> This process led to the creation of a computer
> program that can determine the likelihood that
> a secret message has been hidden within an
> image.
So he can show that something is in there? That's not as big a deal as the article makes it out to be...half the time, you'll know the data has encrypted information embedded in it. The hard part is getting the info OUT OF the data, which the article doesn't really address.
Now we have more people looking at steganography. This can only make it more effective. Sure, the methods we have now might be broken, but what about the next ones, the ones that don't show up on the statistical analysis that he appears to be using?
Bleh!
It occurs to me that all compression involves quantization.
If you consider the case of rounding 0.5 to an integer, it's clear that either possible choice, 1 or 0, is equally good, and in fact the best answer in that case is usually to pick one value at random so as not to add a consistent bias. Therefore, in these rare cases the resulting bits must, by definition, be completely orthogonal to any properties of the resulting image - you could change them all you like.
A steganography routine that did its own compression and only changed these bits would, by definition, be undetectable.
So, with some fairly heavy constraints, undetectable steganography is inherently possible.
There must be various ways of making steganography routines that use this property, even routines that don't do the original compression, by finding LSBs that by some measures are really good candidates for having originally been rounded from near 0.5 and only touching those.
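A toy version of that idea, as a sketch only: quantize a list of values, and wherever a value sits exactly halfway between two integers, let a message bit decide which way to round. `quantize_with_payload` is a made-up name, exact .5 ties are assumed to be detectable, and the sketch does not solve how the receiver would learn which positions were ties.

```python
import math


def quantize_with_payload(values, bits):
    """Round values to integers, spending one message bit on each exact .5 tie.

    Returns the quantized values and the indices where a bit was stored.
    """
    pending = list(bits)
    out, used = [], []
    for i, v in enumerate(values):
        low = math.floor(v)
        if v - low == 0.5 and pending:
            bit = pending.pop(0)
            # low and low + 1 differ in parity, so exactly one has LSB == bit.
            out.append(low if low % 2 == bit else low + 1)
            used.append(i)
        else:
            out.append(round(v))
    return out, used


values = [1.2, 2.5, 3.7, 4.5, 5.5, 6.1]
quantized, positions = quantize_with_payload(values, [1, 0, 1])
```

Every output is still a nearest integer to its input, so the quantization error is no worse than ordinary rounding. The payload is tiny, which matches the "fairly heavy constraints" mentioned above.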
What cha all think?
I demand a million helicopters and a DOLLAR!
You see? You see? Your stupid minds! Stupid! Stupid!
Wasn't that what the kids in Along Came a Spider used to chat in class? Some kind of over-simplified version of it, anyways ^_^
Burn the land and boil the sea, you can't take the sky from me
Can you imagine using his techniques to search through Google's image archives, or perhaps a gnutella network just to see what is sitting out there?
This sounds like it could uncover yet another seedy underbelly of world culture.
I imagine there could potentially be millions of hidden messages out there that no one knows about.
They shaved a messenger's head, etched in a message, let his hair grow back, and sent him on his way.
Well, there's quite a simple solution there: don't use lossy compression.
If you take a photo of a TV screen, it comes out black..
DMCA, It's fun to violate the DMCA, hey. Get your decoder ring, It's a criminal thing, You can teach your new cellmates to sing! Randy Chase
Second, I'm not sure how to react to this. I don't use steganography to hide information, nor do I encrypt my email normally. I guess it's good to know if the techniques used to do this are detectable or breakable, but if it was actually used on a large scale you can bet I'd be screaming, "Big Brother!!!"
Even Slashdot wants to hide some things
It's true that an altered image can be detected via a mathematical function, but saying that it can be detected without having a source image to begin with? What if I take a picture of a random image and then stuff the encrypted message into the image? Voila, undetectable. Randomness makes the perfect concealment.
I can see detectability from some of the crude software packages out there, but not the better ones that make sure the applied file is expanded to the size of the image and reversed.
Do not look at laser with remaining good eye.
For instance, two different jpeg encoders, both at the same quality level, will result in subtly different encodings of the same source image. If you take these two images, calculate the difference at each decoded pixel, and amplify the difference (so that you can easily detect minute intensity differences) you'll see the signature of the differences between the encoding engines.
Now say I encode a message in the image (a 1 megapixel image, small by today's standards, can encode a 1 megabit steganographic message assuming only a 1 bit change in colour). If you could get the source image and do the above described difference calculation you would see the pattern representing the message.
If you pick the wrong source image (it LOOKS identical but was compressed slightly differently), you'll only reveal a combination of the signature and message.
Do whatever statistical examination of this noisy signature you want, I don't see how you can determine that the image concealed data. Well, unless you do an impressively poor job of concealing the data in the message. Encoding your message in a pure white gif, jpg or png would be a bad idea for instance.
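The difference-and-amplify check described above is simple enough to sketch, assuming both images are already decoded to flat lists of 8-bit values. The pixel values, the gain of 64 and the name `amplified_diff` are all made up for illustration.

```python
def amplified_diff(image_a, image_b, gain=64):
    """Per-pixel absolute difference, scaled up so 1-step changes become visible."""
    return [min(255, abs(a - b) * gain) for a, b in zip(image_a, image_b)]


source = [100, 101, 102, 103]   # hypothetical decoded source pixels
stego = [101, 101, 102, 102]    # same pixels with some LSBs flipped
diff = amplified_diff(source, stego)

# Capacity at one bit per pixel for a 1 megapixel image.
capacity_bits = 1_000_000 * 1
capacity_bytes = capacity_bits // 8
```

With the true source in hand the changed pixels light up directly. With a differently encoded copy, encoder noise would show up in the same difference image, which is the point being made above.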
Chris Kuivenhoven is a thief, beware
Seems to me that a watermark is a form of steganography. I wonder if these techniques would work for watermark detection?
niels provos (openbsd and openssh developer) has a stego detector based on similar principles (i.e., look for statistical anomalies in jpeg files).
in fact he is presenting a paper on the subject at the usenix security conference tomorrow.
unlike the dartmouth folks, who apparently think press reports are the proper medium for scientific interchange, provos makes his results publicly available; see
http://www.citi.umich.edu/techreports/
reports 01-1 and 01-4.
nobody
parturiunt montes, nascetur ridiculus mus
o/~ Join us now and share the software
And when you actually can detect one technique, new and better techniques will crop up and take its place.
That's like saying 'if somebody can break 56-bit keys, you can just increase the key length'. In other words, it's really not that simple. Firstly, you're assuming that there will always be new techniques. Secondly, you're suggesting that these new techniques will always be harder to detect than previous techniques. Thirdly, you're assuming the licensing model of such techniques will allow them to take the place of existing techniques.
In short, until you know what you're talking about, or are able to engage your brain, please shut up with your opinion, and just deliver articles and facts. Thanks.
The article stated that the guy used an algorithm to detect statistical variations and predict whether an image had steganographically hidden data 90% of the time.
How about a GIMP or Photoshop plugin to randomly insert junk data in any JPEG saved in order to make this technique useless? It'd be fun to see the NSA sit and fret over an image that apparently had a list of Warez traders and DMCA violators but instead contained the lyrics to 'Penny Lane'.
Better yet, how about an Apache module that does this same thing to every JPG it serves?
The point is, that as soon as it becomes common procedure to intercept images to check for steganography, those who use steganography will switch methods. I bet PGP data encoded in a JPG is a lot harder to detect, and infinitely harder to extract.
Huh?
Nosce te Ipsum
I think the legal community would be interested in anything that might help them find and interpret potential evidence. When evidence is properly confiscated, we now have the techniques to break locks on doors and safes, we now have the techniques to crack certain types of cryptography, and hopefully we'll soon have the techniques to FIND a stack of steganographically-hidden evidence. It's pretty relevant to our legal system.
Yeah, I know, don't feed the trolls. Are they animals, or some form of insect life, I wonder?
http://members.tripod.com/steganography/stego.html
is a great place and has a software archive.
Do not look at laser with remaining good eye.
How could something like this be held up as any sort of evidence? From what I can tell, this guy is trying to check whether there may be hidden data by looking at compression rates and randomness comparisons. But what if the photograph or audio file is inherently noisy? Or what if you use a poor implementation of the compression algorithm?
With standard encryption, if you are in court you can be ordered to decrypt it, but if there is a chance where there is nothing there, they can't force you to do anything.
This just seems to be a waste of time to me.
Actually, randomness makes piss-poor concealment. Any data encrypted by a decent algorithm looks random, and that makes it look suspicious to the spooks ("Yup, he's sending another 10MB of "random" data to the anonymous remailer again.").
The whole point of steganography is to hide a message in innocuous data. The kind of data that people send most frequently, and is likely to go unnoticed. Stuff like digital photos, audio, etc.
Your average image has a fairly predictable amount of randomness in it. What he's done is basically found a statistical way to identify if an image has more randomness than you'd expect in a similar picture. Your random image would probably set off all kinds of alarm bells in his system.
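One textbook way to put a number on "more randomness than you'd expect" is the pairs-of-values chi-square test on the pixel histogram. Below is a minimal Python sketch, assuming 8-bit grayscale values in a flat list and naive LSB overwriting. It is not the method from the article (which gives no details), and the function names and the toy cover data are made up for illustration.

```python
import random
from collections import Counter


def pov_chi_square(pixels):
    """Chi-square statistic over the value pairs (2k, 2k+1).

    Overwriting LSBs with random bits tends to equalize the two counts
    in each pair, so a LOW statistic hints at embedding and a HIGH one
    looks more like an untouched image.
    """
    hist = Counter(pixels)
    stat = 0.0
    for k in range(128):
        even, odd = hist[2 * k], hist[2 * k + 1]
        expected = (even + odd) / 2
        if expected > 0:
            stat += (even - expected) ** 2 / expected
    return stat


def embed_random_lsbs(pixels, rng):
    """Overwrite each LSB with a random bit (stand-in for an encrypted payload)."""
    return [(p & 0xFE) | rng.getrandbits(1) for p in pixels]


# Toy cover data (an assumption for the demo): every value pair is lopsided.
cover = [2 * (i % 100) + (1 if i % 4 == 0 else 0) for i in range(4000)]
stego = embed_random_lsbs(cover, random.Random(1))

print("cover:", round(pov_chi_square(cover), 1))
print("stego:", round(pov_chi_square(stego), 1))
```

On this toy data the cover scores far higher than the stego copy. Real photographs are much less clear-cut, which is where the false positives come from.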
(note: lemon juice was one of the first "invisible inks")
Dump the IRS - http://www.fairtax.org
Good encryption output approximates a pseudo-random number sequence. Just encrypt the image stream. Then embed the encrypted image into the picture and the result should be indistinguishable from noise.
Tell me how to beat that without having to: approximate the last bit, strip it, and run hundreds of hours of cryptanalysis on the approximate data.
-RCHF
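The encrypt-then-hide idea above can be sketched in a few lines of Python. Assumptions, loudly: the cover is raw, losslessly stored pixel bytes (a stand-in `bytes` object here), and the "cipher" is a SHA-256 counter-mode keystream used purely for illustration, not a vetted encryption scheme. All names (`keystream`, `xor_crypt`, `embed`, `extract`) are hypothetical.

```python
import hashlib


def keystream(key: bytes, n: int) -> bytes:
    """SHA-256 in counter mode. Illustrative only, NOT a vetted cipher."""
    out = bytearray()
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:n])


def xor_crypt(key: bytes, data: bytes) -> bytes:
    """XOR with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def embed(pixels: bytes, payload: bytes) -> bytes:
    """Write payload bits, MSB first, into the LSB of successive pixel bytes."""
    if len(payload) * 8 > len(pixels):
        raise ValueError("cover too small for payload")
    out = bytearray(pixels)
    for i in range(len(payload) * 8):
        bit = (payload[i // 8] >> (7 - i % 8)) & 1
        out[i] = (out[i] & 0xFE) | bit
    return bytes(out)


def extract(pixels: bytes, n_bytes: int) -> bytes:
    """Read n_bytes back out of the LSBs, MSB first."""
    out = bytearray()
    for b in range(n_bytes):
        value = 0
        for i in range(8):
            value = (value << 1) | (pixels[b * 8 + i] & 1)
        out.append(value)
    return bytes(out)


key = b"shared secret"            # hypothetical pre-shared key
cover = bytes(range(256)) * 2     # stand-in for raw pixel data
secret = b"meet at noon"

stego = embed(cover, xor_crypt(key, secret))
recovered = xor_crypt(key, extract(stego, len(secret)))
print(recovered)
```

Each cover byte changes by at most one, so the picture looks the same to the eye. Whether the LSB plane still looks statistically normal is a separate question, and lossy recompression would destroy the payload.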
u cn b a stngrfr!
Name your attachment letter.doc.pif and send it with the message "I send you this file in order to have your advice"
The article sure does make it seem like steganography is the work of the devil. But watermarking documents and sound files is endorsed by such fine members of the establishment as the RIAA and SDMI. So is steganography evil or good? It's just neutral, despite what the article says.
This guy should still be afraid of violating the DMCA. If he tries to detect steganographic images in a sound file, he might run afoul of the RIAA. He shouldn't even think about publishing his research.
How does open-sourcing a steganographic technique impact its usefulness? I suppose it would depend on the nature of the technique. For example, it seems open-source public-key encryption techniques don't compromise their usefulness simply by sharing their algorithms/source. Is this equally possible with steganography or must the methods remain more secretive?
Arrest this man! He has broken the most sacred of our nation's laws... the DMCA. This evil man has willfully created a method to defeat an encryption system. He is reverse engineering something, for God's sake!!! He must hang!
If you look closely in the subject of this post, it will reveal a hidden Satanic message!
Beware what you see!
This is an interesting idea, but surely any good encryption produces an output which is indistinguishable from random noise. So, how can the algorithms mentioned in the article (which is interesting, but rather short on facts...) distinguish between the noise added by a steganographically embedded encrypted message and the noise caused by a slightly underspecced A to D converter?
I'm honestly curious... has anyone got any links to a more detailed report on this?
Crap my underground pr0n ring is in danger now!! Better start embedding in mp3's...
.ph0x
---
ps -aux | grep mind
then only criminals will hide secrets in porn.
One should point out that you can also contextualize it, with a common base of painting for example - the use of certain background images or shades can have a meaning that a machine will miss, but a human can translate:
Picture of small boy holding a goose while reading a book = I am hungry for words.
Picture of a goose holding a book about a small boy = The feds are spying on me.
Picture done with buttons instead = no more bagels.
--- Will in Seattle - What are you doing to fight the War?
What this researcher is not mentioning is the false positive rate, meaning how often the algorithm reports that a file contains steg when it actually doesn't. There are many tools out there for detecting steg, but their false positive rates render them useless for practical use. I haven't seen any tools under 10%, but I have seen some as high as 65%. This means that the tool says that 65% of all images are steg'd!
False positives are often simply a property of the mechanism that created the image in the first place. For example, certain graphics programs and digital cameras will ALWAYS produce files that look like they contain steg.
Some of the other posts here have mentioned using a Carnivore-like system with steg detection. With a modest false positive rate of 10%, imagine how many false positives you would have by searching just your office for a month. Not to mention the fact that once you have all of those files, what do you do then? Arrest everyone who sent a file that has a remote possibility of being steg? You guys and gals can sleep a little safer because I seriously doubt the government has enough resources to look through 10% of every graphic or sound file that gets transmitted via email.
"The courts are terribly unprepared to handle the new breed of digital criminals that has emerged, along with the rapid increase in low-cost and sophisticated digital technology," said Farid. Well, I have to say, DUH!!
Wouldn't good voice recognition software help?
until I decided to let dinosaurs alone.
Seriously, whether it's typing profiles, mouse moves, misspellings, funny walks, all can be copied and can have inaccuracies that cause misidentification.
Besides, what would we do without steganographers? And steganographists? Subject them to stalactites and stalagmites by satellite?
--- Will in Seattle - What are you doing to fight the War?
My biggest fear is that someday people might be able to spread racy photos of motherboards around the world WITHOUT DETECTION! Will someone PLEASE think of the children.
-- sometimes AND gates turn me on.
I'm researching Steganography!
-- www.globaltics.net
Political discussion for a new world
I once saw a CIA statistic that claimed that Osama Bin-Lauden was resonsible for 23% of all "free porn" sites on the internet.
[The Weathermen ate my balls.]
If anyone wants more info on this kind of thing (information hiding) pick up a book by Simon Singh . I recommend The Code Book.
[shameless plug] Pounding Sand Tshirts. Get your Micro$oft satire here!
My beliefs do not require that you agree with them.
You know what I mean (or are you new here?)... If I'm trying to control access to something, isn't it illegal under the DMCA to circumvent my access control techniques? Suspicion of illegal activity is NOT ENOUGH reason (for non-law-enforcement types, at least) to go poking around in my data.
Here's an interesting article that mentions some steganographic pictures hidden on some ebay auctions! Bin Laden at work?
NSA, Pentagon, Police Fund Research Into Steganography
cpeterso
How come Dr. Farid is not
battling Guns?
Sounds like someone who should work for
a totalitarian government.
P-W-E-I-sation!
Given a certain state of network bandwidth, the quality of images transferred over the network is likely to increase as the ability to transmit that data increases. This means that anyone trying large-scale data mining for steganographic data, for example in a Carnivore-type application, would need to have many times the bandwidth of ALL the senders/receivers in order to analyze that much data.
That would make it so the only real application of this method would be for people you already suspect of sending steganographic data. You could direct the search toward them. However, then it is still trial and error to find which steganographic protocol they used, etc., and you're back to square one.
Maybe if the steganographic checking system was actually *integrated* into the Carnivore system you could get somewhere. It might be a good way to search for messages that were "suspicious".
It is interesting, though, that this method is possible without knowing the individual steganographic protocols. It just seems that it would be too resource-intensive to deploy on a wide scale, and a wide scale is the only place it would be really more useful than trial and error.
"He's more machine now than man, twisted and evil."
You mean feds can now not only pass laws but enforce them over the whole of the internet? He works fast does this guy Bush!
No, your children are not the special ones. Nor are your pets.
What it boils down to is this:
The more the corporations and their lackeys in government restrict freedom, the more determined those who want to preserve it will become, and the less effective the restrictions will be.
For one thing, it's a challenge, and nothing inspires greater accomplishments from hackers than waving a red flag.
=== The price of freedom is eternal vigilance
For literally hundreds of years encryption technologies have stayed just ahead of cryptanalysis technologies. We now have entirely new criteria by which we judge crypto, and new methods by which we develop it. It used to be considered good data hiding to hide a message in a cake or a bottle. For several hundred years it was accepted that the strength of a cypher was in keeping the cypher itself secret; now it's thought foolish to have a cypher whose security rests on the secrecy of anything but the key.
Current systems are based on complex, provable, mathematical models. Quite a departure from the Caesar cypher and a secret bottle cap. In spite of this, though, we still occasionally come up with something like a faster method to solve knapsack cyphers and turn the world around.
If you have any question about the value of good stego/anti-stego or crypto/anti-crypto, just ask Mary, Queen of Scots or thousands of dead U-boat sailors.
So let's also do the inverse... ALTER a picture's statistical properties, without including any real data. Make it LOOK like there ought to be a steganographic message, when none really exists. This would be kind of like a more erudite version of the "jam Echelon" civil disobedience, but you can send pictures of your kids, pets, etc. to all your friends without them thinking "Why does this dude have a .sig about Lon Horiuchi selling cocaine and C4 to the Iraqis?" The only ones who'd waste time looking for messages that weren't there would be the people who were spying on your pictures to begin with.
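A decoy of this kind could be as simple as scrambling the least significant bits without storing anything. A minimal Python sketch, assuming raw pixel bytes (the sample values and the name `randomize_lsbs` are invented for the example):

```python
import secrets


def randomize_lsbs(pixels: bytes) -> bytes:
    """Replace every least significant bit with a random bit; no message is stored."""
    return bytes((p & 0xFE) | secrets.randbits(1) for p in pixels)


photo = bytes([120, 121, 119, 200, 201, 64, 65, 66])  # stand-in for raw pixel bytes
decoy = randomize_lsbs(photo)
print(list(decoy))
```

The upper seven bits of every byte are untouched, so the picture looks the same, but a detector that keys on LSB statistics may well flag it.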
then only criminals will hide secrets in porn.
Porn's good. Er, I mean for steganography that is... "I only use porn for security reasons".
As another thought, how about using TCP window pointers? You might only get a couple bits per TCP packet, but they can add up. This might be useful for key exchange, for instance. Also, there would be no lasting image (or whatever) subject to future recovery. On the other hand, you would have to watch out for proxies.
A dingo ate my sig...
We've all heard of "Security through obscurity", well his methods are "Detection through obscurity". Once his detector becomes public (or there is an open Oracle the way the SDMI challenge had) people will quickly alter their techniques to avoid it. Since there is virtually zero technical information in the article let me take a guess as to what he could be doing: Picture two rows of three pixels...
P11 P12 P13
P21 P22 P23
If the pixels in each column have the same values *except* for the LSB of each [i.e. (P11&0xFE) == (P21&0xFE) && (P12&0xFE) == (P22&0xFE) && (P13&0xFE) == (P23&0xFE)], then the probability of an encoded message rises the more often this condition holds throughout the picture.
But it's easy enough to make an encoding algorithm smart enough to avoid that trap.
--Rob
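Rob's guess can be turned into a small Python scoring function. This is only a sketch of that guess, not anything confirmed by the article, and it reads "the same except for the LSB" as: vertically adjacent pixels whose upper seven bits match but whose raw values differ. The image is assumed to be a list of rows of 8-bit values, and `lsb_only_mismatch_rate` is a made-up name.

```python
def lsb_only_mismatch_rate(image):
    """Fraction of vertically adjacent pixel pairs that differ only in the LSB."""
    hits = total = 0
    for upper, lower in zip(image, image[1:]):
        for a, b in zip(upper, lower):
            total += 1
            if a != b and (a & 0xFE) == (b & 0xFE):
                hits += 1
    return hits / total if total else 0.0


flat = [[100, 100, 100], [100, 100, 100]]    # untouched flat region
noisy = [[100, 101, 100], [101, 101, 101]]   # same region with LSBs disturbed
print(lsb_only_mismatch_rate(flat), lsb_only_mismatch_rate(noisy))
```

A smooth region that suddenly shows many LSB-only differences scores high. As Rob says, an encoder that skips smooth regions would sidestep the check.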
I suggest that we flood the net with documents containing hidden bogus messages. Maybe an innocuous worm or virus would do the trick. It could seek out audio and image files and insert random messages. That should keep the spying computers of the government and other freedom hating organizations busy.
But wait a minute, seeing they can enact freedom squashing laws like the DMCA with impunity, what's to keep them from making steganography illegal? Resist Big Brother. Demand freedom always!
whatT? would Hidien tExt damage Your retinaAs embedded in youR imageEs If you diddeNt view theM under florecent lIghitiNg but insteaD when outside?
Moneyed corporations, non-working 'poor' and criminal prisoners are turning productive citizens into tax-slaves.
I have a st0rn ftp server with all sorts of kinky stuff!
At first glance, Battling Steganography seemed like the title to a new Jurassic Park movie!
Have fun: Join D.N.A. (National Dyslexics Association)
Mod this up!!
The 1 minute explanation of entropy signature analysis is that it seeks to quantify in R^(n+m) space, the statistical properties of a stream of data by applying n statistical tests to the data. How well or poorly the data passes these tests helps identify the method of generation.
With the right tests and the same data one can distinguish between DES CBC and RC2 for example. With enough traffic, given two different streams of similar data (say encrypted human language text) one can also distinguish between cyphers as well.
A 3DES stream mixed with random bits can be unmixed if the random generator is not cryptographically strong. For example, if you use Linux's /dev/random to mix random data with ciphertext, you can be undone with sufficient entropy signature analysis.
Mixing plaintext with random bits makes the cracking job easier.
Using something non-random such as an image makes the job even easier.
To do Steganography well, you need a cryptographically strong random or pseudo-random number generator, a good mixing function and method to transform your hidden text into a sufficiently similar data stream so that its entropy signature is similar to that of your base pattern / image / noise stream. Doing Steganography really well is hard.
Reducing the signal to noise ratio and keeping the amount of signal text to a minimum also helps, BTW.
In case anyone was wondering why I spend time working with LavaRnd, cryptographically strong PRNGs, Lava Lite ® lamps and other random oddiments ... :-)
chongo (was here)
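To make the "apply n statistical tests and treat the scores as a point in space" idea concrete, here is a rough Python sketch with three simple tests. The choice of tests and every function name are assumptions for illustration; the post does not say which tests are actually used, and real analysis would use many more.

```python
import math
import os
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (8.0 is the maximum)."""
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in Counter(data).values())


def chi_square_uniform(data: bytes) -> float:
    """Chi-square distance of the byte histogram from a flat one."""
    expected = len(data) / 256
    counts = Counter(data)
    return sum((counts[b] - expected) ** 2 / expected for b in range(256))


def ones_fraction(data: bytes) -> float:
    """Share of 1 bits; about 0.5 for random-looking data."""
    return sum(bin(b).count("1") for b in data) / (8 * len(data))


def signature(data: bytes) -> tuple:
    """One point in the feature space: here, three test scores."""
    return (byte_entropy(data), chi_square_uniform(data), ones_fraction(data))


text = b"the quick brown fox jumps over the lazy dog " * 200
noise = os.urandom(len(text))
print("text :", signature(text))
print("noise:", signature(noise))
```

Plain English text and random bytes land far apart in this space. Telling two good cyphers apart, as the post describes, would need far subtler tests and a lot more traffic.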
This only applies to an attack against one form of steganography, not the field as a whole. The incredibly ancient art of code words and hidden meanings will still continue as before; it just means that hiding bits in GIFs will have to get a little bit more clever. Probably the next generation of stego software will have built-in wavelet algorithms so that the program can automatically place the bits so that the wavelets won't be altered. This program is only 90% accurate, which means that one out of every ten images with hidden bits doesn't set off an alarm. The only reason to get nervous is if the software is 99% accurate. 10% inaccuracy means that it's very easy to circumvent.
If god had intended you to be naked, you would have been born that way.
"responsible" (stenographically hiddedned message informing his CIA controller that he as "p"eed on O B-L.)
I went to a presentation about a month ago from a student who had been working on this project at Dartmouth.
He was talking about several projects, and he hesitated to talk about this one, because their method of detection using wavelets (as described in the article) had just been bypassed by a method that uses wavelets to hide the data...
The article doesn't mention any of this, but maybe this is the 10 percent they can't detect.
The 1 minute explanation of entropy signature analysis is that it seeks to quantify in R^(n+m) space, the statistical properties of a stream of data by applying n statistical tests to the data. How well or poorly the data passes these tests helps identify the method of generation.
I'm curious about this statement. Assuming a truly random number source, an excellent encryption system, and removing any identifying marks (header, etc.), a cryptographic string should be indistinguishable from random data. Any given byte should statistically appear the same number of times as any other, and any pattern should appear the same number of times as any other pattern of the same length. Is there some important mathematical precept I'm missing, or are you merely talking about the idiosyncrasies of conventional algorithms?
In case anyone was wondering why I spend time working with LavaRnd, cryptographically strong PRNGs, Lava Lite ® lamps and other random oddiments
When I came across the original SGI Lava Lamp number generator so long ago, I thought it was one of the coolest things around. I have yet to come across something that could generate as random a number in as closed a space... cool stuff.
First thing you learn in any intro CS course: bit strings have NO inherent meaning.
Actually the best legal defense against having any copyrighted mp3, DeCSS source, or porn embedded in a family photo on your HD is that it is just a very long integer. It's not my fault that you choose to interpret it as something else.
The tricky bit isn't that 10% of the tampered files are undetected. It's that 10% of the vastly larger pool of untampered files are falsely detected.[1]
For instance, if 5% of the source files have been tampered with and the chances of false negatives and false positives are 10% each, then 14% (=.05*.9+.95*.1) show up as positive. So, given that the test shows positive, there is less than a one in three chance (.045/.14, about 32%) that the file actually was tampered with. In practice the rate of tampering is much lower than 5%, and the conditional probability of a tested positive indicating an actual positive will be correspondingly lower.
[1] The article didn't distinguish these error types and the 90% may only refer to the false negative rate.
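The arithmetic above is just Bayes' rule, and a few lines of Python make it easy to try other rates. The 5% prior and the 10% error rates are the figures from the post; the function name is made up.

```python
def posterior_tampered(prior, false_neg, false_pos):
    """Return (P(flagged), P(tampered | flagged)) by Bayes' rule."""
    true_pos = prior * (1 - false_neg)       # tampered and flagged
    false_alarm = (1 - prior) * false_pos    # clean but flagged
    p_flagged = true_pos + false_alarm
    return p_flagged, true_pos / p_flagged


p_flagged, post = posterior_tampered(prior=0.05, false_neg=0.10, false_pos=0.10)
print(f"flagged: {p_flagged:.3f}, tampered given flagged: {post:.3f}")
# flagged: 0.140, tampered given flagged: 0.321
```

Drop the prior to one tampered file in a thousand with the same error rates and the chance that a flagged file is really tampered falls below 1%.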
to compose messages in bricktext. It's an oddly
amusing linguistic exercise and I find it to not
be nearly so difficult as one might expect. But
there are also those maniacs who have chosen (as
I have not) to take it further, by combining the
technique you used earlier with that which I now
use to produce the dreaded acrostic bricktext, a
feat which sane men fear to even consider. Even
that is not the true pinnacle of this black art.
The ultimate feat, and one which I have seen but
a single time, is the double-acrostic bricktext,
which bears one message in the first position of
each line and another in the line's last letter.
Very interesting stuff, thanks for the info. I have a feeling that if I want to get much more in depth, I'll have to start reading equations.
Gotta love moderating. My original article stays a 1 and your reply gets a 2.
Well, if it makes you feel any better, it's a 2 because I used my bonus point (high karma) and not because some moderator thinks I'm really smart. My original post was at 5 yesterday and got knocked down to 4 today as "overrated." Perhaps I'm being punished for not specifying lossy compression, oh well.
Maybe moderation scores are a good source of random noise?
You know that isn't the case. Most moderation is pretty easy to predict. There's just enough randomness to keep things interesting - it'd be boring if only reasonable, knowledgeable people got modded up.
1) Take the first letter of each line.
2) Take the first word of each paragraph.
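Both steps are easy to automate. A small Python sketch, assuming paragraphs are separated by blank lines; the sample text and the function names are invented for the example:

```python
def first_letters(text: str) -> str:
    """Step 1: first character of each non-empty line."""
    return "".join(line.lstrip()[0] for line in text.splitlines() if line.strip())


def first_words(text: str) -> list:
    """Step 2: first word of each paragraph (paragraphs split on blank lines)."""
    return [p.split()[0] for p in text.split("\n\n") if p.strip()]


sample = "Here we go\nIt rains\n\nDown by the river\nEveryone waits"
print(first_letters(sample))   # HIDE
print(first_words(sample))     # ['Here', 'Down']
```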
SO how long till they can take this sniffer of theirs and make it SEEM like my vacation pictures contain terrorist espionage messages? I mean, by the laws of randomness, one *could* find whatever they wanted to see in any random image just by arranging the bits to their tastes.
I had a rather witty post ready with ASCII art with the words, "must kill president" hidden in the image....but /. won't let you post ASCII art. Gets this message: Lameness filter encountered. Post aborted.
Kind thoughts do not change the world
She lives next door to me in my apartment building. I can hear her tapping away on her typewriter at all hours of the day and night. I think she is recording everything I say. Damn, she's doing it again -- when will the madness end?
Oh, you said steganography... Never mind.
"He was a wise man who invented beer." -- Plato
There's that steganography tool, Outguess, that claims it can hide info in a pic without changing the pic's statistical properties (entropy et al, I surmise). I wonder if it's Outguess that makes false (or misinformed) claims, or if Prof. Farid's research on statistical analysis is already out of date...
Personally, no matter what, I wish Prof. Farid a lot of luck. His work might be what will save our collective ass from SDMI-like schemes down the road.
-- B.
This sig does in fact not have the property it claims not to have.