Collage, and the Challenge of "Deniability"
Slashdot regular Bennett Haselton has written a piece on a new program called Collage that can circumvent censorship by embedding messages in user-generated content on sites like Flickr. The program demonstrates that a long-standing theoretical concept can be reduced to practice but Bennett wonders if anybody would actually need it, as long as they can exchange encrypted messages over Gmail and AIM. He begins "In a presentation delivered at USENIX, Georgia Tech grad student
Sam Burnett and his colleagues described how their new program, "Collage," could
circumvent Internet censorship by embedding messages in user-generated content on
sites like Flickr. The short version is that a publisher uses the Collage system to
break a message into pieces that are small enough to embed into a photograph using
standard steganography, the photos are published according to some protocol (e.g. "all
photos in the photostream of user xyz" or "all photos tagged with the 'xyz' tag"),
and receivers who know the protocol for identifying the photos can retrieve
them and decode the message. According to the authors' paper, the system is general
enough that it could be adapted to almost any site where user-generated content is
published. (All of this can be done by hand using existing tools, but Collage automates
the process to hide the individual steps from the user.)"
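To make the "break a message into pieces and embed each piece" step concrete, here is a minimal sketch in Python. It is not Collage's actual encoding, which the description above does not spell out; it is plain least-significant-bit steganography over raw byte buffers standing in for photo pixel data. The function names (`split_message`, `embed`, `extract`) and the one-byte length prefix are assumptions made for illustration.

```python
def split_message(message: bytes, chunk_size: int) -> list[bytes]:
    """Break a message into pieces small enough for one cover each."""
    return [message[i:i + chunk_size] for i in range(0, len(message), chunk_size)]


def embed(cover: bytes, payload: bytes) -> bytes:
    """Hide payload in the lowest bit of successive cover bytes.

    Layout (an assumption of this sketch): one length byte, then the payload.
    """
    if len(payload) > 255:
        raise ValueError("payload too long for a one-byte length prefix")
    data = bytes([len(payload)]) + payload
    bits = [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover too small for this payload")
    out = bytearray(cover)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # overwrite only the lowest bit
    return bytes(out)


def extract(stego: bytes) -> bytes:
    """Recover the payload written by embed()."""
    def read_byte(offset: int) -> int:
        value = 0
        for sample in stego[offset:offset + 8]:
            value = (value << 1) | (sample & 1)
        return value

    length = read_byte(0)
    return bytes(read_byte(8 * (i + 1)) for i in range(length))


# Usage: one cover buffer per chunk, reassembled in order by the receiver.
message = b"meet at the usual place"
cover = bytes(range(200))  # stand-in for pixel data
stego_photos = [embed(cover, chunk) for chunk in split_message(message, 8)]
recovered = b"".join(extract(photo) for photo in stego_photos)
```

The receiver still needs to know which photos to fetch and in what order, which is the "protocol" the article refers to.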
From this short description, you can see the two salient facts about Collage:
(1) it's robust, in the sense that in order to shut it down completely, the censor
would have to block every site containing user-generated content; and (2) it's
efficient only for small text messages (which is what the authors used to test it),
and not for high-bandwidth communications such as video. The authors have also
highlighted the claim that Collage is (3) deniable, in the sense that in using it,
you won't attract the attention of the censors for browsing "innocent" sites like
Flickr. On this point, I'm not so sure; I think it's highly dependent on the kind
of publication system that the sender and the recipient agree on. For example, if the sender
publishes their messages in photos all in one user's photostream, and that photostream
is used primarily by recipients in censored countries to receive encoded messages,
and if virtually nobody ever visits that photostream for any other reason, then if
the censor ever finds out about that photostream, they could flag any user who ever visits it.
It doesn't matter if the "site" as a whole is "innocent", if that one user's
photostream is not.
But there's a more fundamental issue: Currently, in all censored countries, there is at least one way to receive prohibited text messages more efficiently (and with greater deniability) than with Collage. So Collage may work perfectly, but even when it gets released, I'd be very surprised to see large numbers of people using it unless all the simpler alternatives get blocked.
Most tools that people use to circumvent Internet censorship are not "deniable" in the sense described above. If you visit a proxy site like VTunnel, any censor who is monitoring your Internet connection can see that you connected to a known proxy site. If you connect to the proxy site using "https://" instead of "http://", then a censor eavesdropping on your connection won't be able to tell what you looked at through the proxy site (unless they confiscate your computer and look through your browser history), but they'll still be able to tell that you visited a proxy site. Similarly, if you use a tool like UltraSurf or Tor, those tools can circumvent the censor's filters by re-routing your Internet connection through a server outside the censored country -- but a censor monitoring your traffic can still see that you connected to an UltraSurf or Tor server outside the country, even if they can't tell what Web sites you were visiting.
But if all you want is to receive short text messages, then there are many options that are completely "deniable." The simplest is probably to use Gmail and to choose the option to always read messages over https://. (If you sign in to Gmail, under "Settings" you can choose between "Always use https" and "Don't always use https".) If you read your inbox contents using https, then a censor eavesdropping on your connection can't see anything at all -- not the contents of messages that people send you, not the email addresses of people who are writing to you, not even the username that you use to sign in to read your Gmail messages. This gives you more or less perfect deniability. As long as many Gmail users are using Gmail over https://, doing this by itself would not attract undue attention from censors monitoring your Internet traffic. Using Gmail, you could also exchange higher-bandwidth content like images and video (up to Gmail's attachment size limit, currently 25 megabytes), something not possible with Collage.
Of course, if you remember the case in which Yahoo turned over information about one of its Chinese account-holders to the Chinese government (who subsequently arrested the user and sentenced them to 10 years in prison), you may be wary of trusting any Western corporation with your privacy. But in this case, you wouldn't have to. Because even if the Chinese government found out that some Gmail users were using Gmail to receive anti-government messages from the U.S., the censors wouldn't be able to eavesdrop on https-protected connections to find out which users were receiving the messages or what they said, so there would be no information for them to demand that Google turn over to them.
Or if you want to exchange encrypted text messages in real time, you can use any instant messaging client that supports encryption. Whether or not this is "deniable", in the sense of not attracting undue attention for "suspicious activity", depends on what proportion of other users are using the chat program in encrypted mode as well. The current version of AOL Instant Messenger, for example, apparently encrypts all instant messages by default. (Although you should take care to understand exactly what is "encrypted" when using an instant messaging client. In my experiments with AOL Instant Messenger, the contents of messages were encrypted, but the screen names on either end of the conversation were not. In other words, a censor eavesdropping on your traffic can see which screen names you exchanged messages with, but not the message contents. So if there were an AOL user account in a non-censored country that was a dummy account used primarily for passing banned information to users in censored countries, then if the censors ever found out about that account, they could flag and investigate any user in their country who exchanged messages with that screen name.)
The bottom line is that as long as at least one of these alternatives remains unblocked in your country, it would serve as an easier way to achieve the same goals that Collage achieves. They're generally faster, more convenient, and most of the time, more "deniable", in the sense that the traffic they generate won't look as suspicious as, say, browsing a Flickr feed that later becomes widely known as a source of banned encoded messages. Collage does demonstrate that an interesting idea can be reduced to practice, and is robust in the sense that the general scheme cannot be blocked unless a regime blocks access to every site hosting user-submitted content. But there doesn't seem to be a compelling reason to use it unless and until all of the simpler methods get blocked.
I write all of this as someone who also wrote a program a few years ago that was meant to serve as a more robust back-up, in case a more popular method of circumventing censorship ever got shut down by the censors. In my case, I thought that most censoring regimes would start blocking all popular Web proxy sites, so I wrote an install script called "Circumventor" that would let you set up a Web server and James Marshall's CGIProxy script on your home computer, turning it into a mini-Web-proxy site. I assumed that eventually, most people in censored countries would have to rely on someone in a non-censored country to set up a private Web proxy like this and e-mail them the URL, once China and Iran got their act together and started blocking most publicly known Web proxy sites. But that never happened, partly because Web proxy sites are now springing up faster than most censors' databases can keep up with. So the web proxy install script fell by the wayside -- but that's good news, because it means that nobody really needed it, since the simpler, more straightforward methods continued to work. Why pester your cousin in the U.S. to set up a Web proxy for you, when most Web proxies you can find in Google are not even blocked yet?
And so it goes for Collage. It sounds like a perfectly fine idea, and it will be great news all around if nobody ever actually has to use it, because the censors never get around to blocking all of the simpler alternatives.
John has a long moustache.
And now, an oldie but a goodie: I'll be home for Christmas.
if the Chinese government found out that some Gmail users were using Gmail to receive anti-government messages from the U.S., the censors wouldn't be able to eavesdrop on https-protected connections to find out which users were receiving the messages or what they said, so there would be no information for them to demand that Google turn over to them.
In this case, I'd say the Chinese government would already have the IP address of the party in question, and the time span(s) during which they connected to Yahoo (or Yawhoever) via https. Seems to me that's plenty of information for them to go knocking on Yahoo's door and demand full session details.
If libertarians are so opposed to effective government, why don't they all move to Somalia?
The summary says "...as long as they can exchange encrypted messages over Gmail and AIM."
That's a pretty tall order if you are in the type of situation where you need to do that because of censorship. Even in the US (which I would call average to good with regard to exchanging ideas freely) there were efforts to block or slow down encrypted communications (DES, http://en.wikipedia.org/wiki/Data_Encryption_Standard). If you are somewhere where you need the protection of encryption for "legitimate" concerns (like discussing why your brother who held up a sign disappeared), I am willing to bet use of crypto is not safe. It makes far more sense to put crypto messages into steganography such as this. I know I would if I were sending encrypted messages out of fear over the content of my conversation.
Wow, sent an e-mail as suggested when clicking on "use classic" banner, and got a fast response that addressed my msg
I see, you are hiding secret text in Slashdot troll messages.
The Tao of math: The numbers you can count are not the real numbers.
OK, but how do you communicate the "protocol" to your audience, which may be scattered around the globe? And how do you guarantee that the "protocol" hasn't been compromised in transit? As soon as the "protocol" is discovered it becomes easy to begin censoring again. I suppose it could work if you could be face to face with the person you're trying to communicate with and manually give them the "protocol", but if you can do that then you can just exchange public keys too and use the standard public key cryptography setup.
I came to the datacenter drunk with a fake ID, don't you want to be just like me?
Obviously, what they need to do is apply this technique to embed the message in spam messages, in the random dictionary garbage or images in the spam. The recipient then just has to know which spam messages to check for the hidden messages.
Now, we just need someone to do this to show how to smuggle information in/out of the major spam-email-producing countries, and perhaps there will suddenly be more interest in shutting down spammers.
The false assertion is that because gmail and other email can be fully encrypted that the CCCP/"surveillance state of choice" will have no information upon which to demand information. This is false as long as gmail and others track IP addresses, and they do for data-mining and advertising purposes.
You can also use spam to send your coded message, most people never see it or delete it immediately, while the targets get their info while not being singled out for having visited a particular site etc.
how the hell is having encrypted messages in your email account "deniable"? It seems like the whole premise of this article is that "Tha Goog won't give you up, man!". If the "censors" can get yahoo to hand it over, google will too.
The whole point of collage is that nobody knows if there is data hidden in the images or if they're regular old images. i.e. the only person that can "hand over" the data is the sender or receiver, none of the middle-men.
People who have no idea about what they're talking about should shut up.
The Russians were already doing it. It was recent news...
Would it not be easier to just use a simple replace/exchange code? Like replace 'overthrow' with 'support', 'government' with 'recovery', and 'nation' with 'sickness': "I will completely support the recovery of this sickness.".
Or a message where someone only read every 8th word to form a secret message. Or do both and combine with an acrostic. There are much easier ways to hide a message.
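For anyone curious what those two tricks look like mechanically, here is a rough Python sketch. The codebook is the one from the example above; the function names and the filler sentence are made up for illustration, and neither trick is claimed to resist a censor who knows the scheme.

```python
import re

# Codebook taken from the example above: real word -> innocuous word.
CODEBOOK = {"overthrow": "support", "government": "recovery", "nation": "sickness"}


def substitute(text: str, table: dict[str, str]) -> str:
    """Replace whole words found in table, leaving everything else alone."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, table)) + r")\b")
    return pattern.sub(lambda match: table[match.group(1)], text)


def encode_with_codebook(text: str) -> str:
    return substitute(text, CODEBOOK)


def decode_with_codebook(text: str) -> str:
    inverse = {cover: real for real, cover in CODEBOOK.items()}
    return substitute(text, inverse)


def read_every_nth(text: str, n: int = 8) -> list[str]:
    """Return every n-th whitespace-separated word (the 8th, 16th, ...)."""
    return text.split()[n - 1::n]


plain = "I will completely overthrow the government of this nation."
disguised = encode_with_codebook(plain)
hidden = read_every_nth("a b c d e f g MEET a b c d e f g NOON")
```

Both sides have to agree on the codebook or the value of n ahead of time, which is the same key-distribution problem raised elsewhere in this thread.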
I judt got a nre Kinesis keybiartf so please excusr ant egregiou typos.
a Blackberry version would be useful for people living in Saudi Arabia, UAE, India and most importantly, the US.
How about a web-based client interface for browsing encrypted content that is dispersed throughout the web, to increase readership of closed-circle content, and a trust system for automatically sharing access with friends?
Open Standards Portal
There seems to be a disconnect here in the author's understanding of "confidentiality" and "privacy". It is important to recognize that the mere existence of communication is often actionable information, and SSL/IPsec do nothing to prevent such leakage. Systems like Collage are absolutely necessary to fill in these gaps.
"John has a long moustache"
Now let's see, does that mean "go and blow up the telephone lines" or does it mean "Lord Lovett is having a party on Sword beach tomorrow morning, bring your own champagne"?
I for one welcome our June 6th OVERLORD
The false assertion is that because gmail and other email can be fully encrypted that the CCCP/"surveillance state of choice" will have no information upon which to demand information. This is false as long as gmail and others track IP addresses, and they do for data-mining and advertising purposes.
google is big bro
the last thing to use is google anything
Obviously the true reason for 9/11 was to have lots of video footage distributed world wide, in which they could hide messages. :-)
The Tao of math: The numbers you can count are not the real numbers.
The problem with stego is that the program has to leave footprints in the image file so it can extract the encoded text. If the BBG (Big Bad Government) knows what those footprints look like they can search the web for images that contain them. After 9/11 there was a lot of interest in terrorists using stego to communicate, so someone decided to search the whole Internet for images with known stego identifiers. Now where did I read about that...oh yeah: http://slashdot.org/yro/01/09/26/1418252.shtml
"I'm not a quack, I'm a mad scientist! There's a difference." - Dr. Cockroach
Collage solves a different problem from encrypted email.
The purpose of encrypted email is to protect the content of your message, but it does nothing to hide the fact that you sent it and to whom you sent it. Collage on the other hand obscures whether a message has been sent and from/to whom, but does little to obscure the content.
If you are in a situation where it is illegal for you to be sending your message, sending an encrypted email is just asking for the MiB to show up at your door and demand the decryption key (at which point you're screwed).
Some cryptography 101:
Plausible deniability in cryptography means that even if someone suspects there is hidden encrypted data in a data set, they can't prove it, even if they have full knowledge of the protocol.
What is presented here is automated steganography over image sites with many users (hiding the information). If the surveillance entity intercepts such messages and analyses them, they will know that *something* is there, though they won't be able to read it.
Anyway, what it boils down to is that you can't just say there is no message if someone confronts you, and this might very well lay the foundations for your gravestone in countries where the governing entities have a somewhat undemocratic method of dealing with things.
On the other hand, if they don't like you, and really suspect you are up to no good, they will probably shoot you anyway, evidence or not.
As long as google remains as not being a paramilitary organization I'll trust google over any government/ISP. At least they take a stand when asked to turn over user data.
Basically we're living in a world where you have to choose one big brother or the other. So I choose Google, the fuzzier cuddlier googlier one.
From TFS "it's robust, in the sense that in order to shut it down completely, the censor would have to block every site containing user-generated content"
Well, no, Collage is actually weak as hell, because its Achilles' heel is the need to transmit the protocol between all users involved. If the authorities believe the people they are closing in on are using Flickr (to choose just one example), all they need to do is block Flickr to force them to communicate outside of that channel and potentially reveal themselves further. Furthermore, users are lazy, so examining their usage patterns (i.e. seeing if they always select tag 'xyz' on Flickr) will commonly reveal that they are doing something unusual and thus likely to be Up To Something.
Communications security involves a lot more than just encrypting the messages. It's a complex task with a whole hell of a lot of ways to screw up - and seemingly small slips can have vast potential consequences.
One should probably trust a random steganography program even less than a random encryption program, and I am not sure if there ever was a commonly accepted *good* stego program (on par with, say, PGP).
One possible difficulty in, say, hiding messages in low-weight bits ("noise") of digital pictures that I recently thought of (combination of my work and reading that particular thread you referenced) is that they are produced by a physical object (digital camera sensor), with noise likely to be Boltzmann-distributed at, say, 300K. If a program sees just white noise there, or some much higher or much lower effective T, well, immediate red flag!
Now, it is probably possible to take effects like this into account when designing your program, but it would take someone well-versed both in math of crypto AND physics of sensors, which is obviously somewhat higher threshold, and it might end up not being "universal" for different image sources.
Anyway, just my $0.02
Paul B.
If you connect to the proxy site using "https://" instead of "http://", then a censor eavesdropping on your connection, won't be able to tell what you looked at through the proxy site (unless they confiscate your computer and look through your browser history), but they'll still be able to tell that you visited a proxy site.
It's quite trivial to figure out what site someone is visiting inside an encrypted stream:
https://ng.gnunet.org/bn/pet05-bissias
Also, there are 650 CA's, and if even one of them has been co-opted by a hostile government, SSL is useless:
http://www.eff.org/observatory
28,000 keys with the Ubuntu entropy bug? Worthless.
So long as you can visit 4chan and the like without being innately suspicious, LOLCats = Evil pedophile terrorist secret messages?
Not "FrEe ViAgRa"
But I thought it'd be more fun to actually send steganographic stuff, so I coded up a little bit of stuff in matlab (what I was using at the time) that merged a jpeg and a stream of ascii, alternately adding and subtracting the bits of the ascii from the jpeg values. The resulting pictures looked just like pictures: it wasn't visually obvious.
Then I'd post the unmodified pictures in an unlinked directory on my website (this was pre-flickr) so she could download the originals and subtract out the difference.
This would have been easily defeated by the chinese firewall just re-encoding jpegs that passed through to a slightly different size or quality, but they never did so it worked fine. But it was a pain in the butt to actually *use*.
But it'd be even more of a pain in the butt to detect.
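The add-and-subtract scheme described in this comment can be sketched in a few lines of Python. The original was written in matlab and worked on JPEG values; this version only operates on a plain list of 0..255 sample values, and the function names and the range check are assumptions of the sketch rather than details from the comment.

```python
def message_bits(message: bytes) -> list[int]:
    """ASCII message as a flat list of bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]


def embed_delta(original: list[int], message: bytes) -> list[int]:
    """Alternately add and subtract each message bit from successive samples."""
    bits = message_bits(message)
    if len(bits) > len(original):
        raise ValueError("cover too small for this message")
    modified = list(original)
    for i, bit in enumerate(bits):
        sign = 1 if i % 2 == 0 else -1
        value = original[i] + sign * bit
        if not 0 <= value <= 255:
            raise ValueError("sample would leave the 0..255 range")
        modified[i] = value
    return modified


def recover_delta(original: list[int], modified: list[int], n_bytes: int) -> bytes:
    """Subtract the unmodified samples; any nonzero difference is a 1 bit."""
    bits = [abs(m - o) for o, m in zip(original, modified)][:n_bytes * 8]
    out = bytearray()
    for start in range(0, len(bits), 8):
        value = 0
        for bit in bits[start:start + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


original = [100 + (i % 50) for i in range(64)]  # stand-in for image samples
modified = embed_delta(original, b"hi")
```

The receiver needs both the original and the modified samples, which is why the unmodified pictures had to be posted separately, and why any re-encoding in transit breaks the scheme.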
Nostalgia's not what it used to be.
Deniability isn't hard in the US, if you're a politician, that is. You only have to deny, deny, deny. Even when presented with proof of your wrongdoing or misspeaking, you still need only deny it. The media won't call you on it and the people, not knowing whom to believe, will give you the benefit of the doubt and just let it go. I laugh when any politico/spy movie mentions "Plausible Deniability"; it's such an antiquated concept.
If you read your inbox contents using https, then a censor eavesdropping on your connection can't see anything at all -- not the contents of messages that people send you, not the email addresses of people who are writing to you, not even the username that you use to sign in to read your Gmail messages.
Oh really? Are you sure?
First off, the trustworthiness of a group is inversely proportional to its size, so any protocol with "broadcast" in its description is certainly insecure. Collage will only work if it's a private channel between two or three people.
Even then, it's relatively weak. Stego needs to be proof against a determined attack by an expert who suspects it's being used and knows the protocol. The standard safeguard, which is not in Collage, is to first encrypt and then hide.
I'm a Programmer. That's one level above Software Engineer and one level below Engineer.
Usenet spam and porn? The message recipient is indistinguishable from 10,000 horny teenage porn surfers while government authorities can only keep one hand on the keyboard at a time.
At the end of the day, Google doesn't have men with guns who can kick in your door, shoot your dog, and confiscate your property under color of law. They are definitely the lesser evil here.
God invented whiskey so the Irish would not rule the world.
> The commercial and freeware products today do most certainly leave traces.
To convince you that undetectable steganography is possible, think about the following algorithm (which, I admit, has a very, very low ratio of information to carrier). While generating the images I want to use for my carrier data, I set my camera to snap 250 images each time rather than 1. If the scene and the camera are at all realistic, there will be enough entropy in the sets of 250 images so that I can always (for all practical purposes) select one image out of each set of 250 images such that 4 bits of a cryptographic hash of it prefixed by a secret key is a particular nybble.
The encrypted message is then just a sequence of images, one per nybble, where none of the images has been altered in any way whatsoever, merely selected. One has to be careful, however, not to be caught with the other 249 images, and as you have also pointed out, this will not give security against traffic analysis.
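A rough Python sketch of that selection scheme follows, for anyone who wants to see it run. The comment only specifies "4 bits of a cryptographic hash of it prefixed by a secret key"; the choice of SHA-256, the use of the top four bits of the first digest byte, and all function names here are assumptions. The "images" are just byte strings standing in for the 250 shots in each burst.

```python
import hashlib


def nybble_tag(key: bytes, image: bytes) -> int:
    """Four bits of a keyed hash of the image (SHA-256 is an assumption)."""
    return hashlib.sha256(key + image).digest()[0] >> 4


def to_nybbles(message: bytes) -> list[int]:
    return [n for byte in message for n in (byte >> 4, byte & 0x0F)]


def select_images(key: bytes, message: bytes,
                  candidate_sets: list[list[bytes]]) -> list[bytes]:
    """Pick one unaltered image per nybble whose tag equals that nybble."""
    nybbles = to_nybbles(message)
    if len(candidate_sets) < len(nybbles):
        raise ValueError("need one set of candidate images per nybble")
    chosen = []
    for nybble, candidates in zip(nybbles, candidate_sets):
        match = next((img for img in candidates
                      if nybble_tag(key, img) == nybble), None)
        if match is None:
            raise LookupError("no candidate in this set carries the nybble")
        chosen.append(match)
    return chosen


def decode_images(key: bytes, images: list[bytes]) -> bytes:
    """Rebuild the message from the tags of the published images."""
    tags = [nybble_tag(key, img) for img in images]
    return bytes((hi << 4) | lo for hi, lo in zip(tags[0::2], tags[1::2]))


key = b"shared secret"
bursts = [[f"set{s}-shot{i}".encode() for i in range(250)] for s in range(4)]
published = select_images(key, b"ok", bursts)
```

With 250 candidates per set, the chance that none of them carries a given nybble is (15/16)^250, roughly one in ten million, which is why the comment can say a match is available "for all practical purposes".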
Sorry if you already knew this, I see you aren't the original poster, who gave the impression that good steganography was more or less impossible.
How is this different from steghide? Here's the summary from 'aptitude show steghide':
Steghide is steganography program which hides bits of a data file in some of
the least significant bits of another file in such a way that the existence of
the data file is not visible and cannot be proven.
Steghide is designed to be portable and configurable and features hiding data
in bmp, wav and au files, blowfish encryption, MD5 hashing of passphrases to
blowfish keys, and pseudo-random distribution of hidden bits in the container
data.
Ask me about repetitive DNA
I sincerely hope nobody at risk under these conditions takes the suggestions in this article seriously.
For example, the comments regarding SSL-encrypted Gmail ignore the real possibility that the mail storage may be within government jurisdiction. The assumption that these communications are safe is ignorant and foolish.
There are many other faults. Enough to call this article dangerous misinformation.
Anyone besides me wanting to know if the length of the original post had anything to do with whether or not there was a collage-hidden message in it?
Then to have the post end with no mention....very disappointing, but I suppose the author wouldn't be able to maintain perfect deniability if they admitted to the message they just sent to their comrades...
Interesting topic, I had always assumed that gmail access over https was blocked in China, nice to know that it isn't.
So yes, using it seems simpler than steganography, especially since you can encrypt your email before sending them so that not even Google can read them:
remember that even if you trust Google as a corporation to do the right thing, spying or bribing could still be used to access your messages stored on Google's servers.
But the end of the topic is weird; it says basically that the simplest way is still to use proxies as they are not blocked, but if I were a dictator I would NOT block proxies, just list the IP addresses of those who access them and monitor those users, who have helpfully shown that they have something to hide!!
Sure, using a special-purpose server doesn't work for deniability either, but that's why you need to use a popular server such as Facebook as a middleman for the exchange of those "steganography encoded messages".
It strikes me that Gmail over https is actually a worse solution than steganography when deniability is the goal. Deniability doesn't simply mean making it impossible to read a hidden message; it also means hiding a message in a way that doesn't look like one is hiding anything. TOR, Freenet and proxy servers have the same problem. Collage seems to be a slightly Rube-Goldbergian but nevertheless right-headed solution. How does a dissident exchange messages without appearing to do anything sneaky or out of the ordinary on the internet? I wonder if there's a means of hiding messages in the ordinary bandwidth chatter of AJAX pages.