Distinguishing Encrypted Data From Random Data?
gust5av writes "I'm working on a little script to provide very simple and easy to use steganography. I'm using bash together with cryptsetup (without LUKS), and the plausible deniability lies in writing to different parts of a container file. On decryption you specify the offset of the hidden data. Together with a dynamically expanding filesystem, this makes it possible to have an arbitrary number of hidden volumes in a file. It is implausible to reveal the encrypted data without the password, but is it possible to prove there is encrypted data where you claim there's not? If I give someone one file containing random data and another containing data encrypted with AES, will he be able to tell which is which?"
Trick question! It is random text that's been encrypted!
Encrypted files have maximum entropy, just like truly random files. Basically, you can't tell which one is which. However, truly random noise on a disk isn't all that usual, so any encrypted file (or pure random file) will stick out like a sore thumb: it will be highly visible. But, again, you can't tell the difference.
cpghost at Cordula's Web.
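To put a rough number on the entropy point above, here is a minimal sketch that measures empirical byte entropy. The helper name `shannon_entropy` and both sample inputs are made up for illustration, and byte-frequency entropy is only one crude statistic: a high score says the data looks uniform, not that it is encrypted.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the empirical byte entropy of data, in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical samples: OS-provided random bytes versus repetitive English text.
random_sample = os.urandom(1 << 16)
text_sample = b"The quick brown fox jumps over the lazy dog. " * 1500

random_entropy = shannon_entropy(random_sample)
text_entropy = shannon_entropy(text_sample)
print(f"random: {random_entropy:.3f} bits/byte")
print(f"text:   {text_entropy:.3f} bits/byte")
```

The random sample should land very close to 8 bits per byte and the text well below that. Well-encrypted data would be expected to score like the random sample, which is exactly why this measurement cannot separate the two.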
As far as I know finding patterns in the output is tightly linked to reducing the number of possible keys, so good encryption algorithms should not create patterns. Of course if your encryption software writes some kind of header - which wouldn't affect the security of the encrypted contents - then it will be obvious to anyone looking that you have an encrypted container. So this is 99% about implementation and 1% about encryption algorithms.
Live today, because you never know what tomorrow brings
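As a small illustration of the header point above: a LUKS container starts with a fixed six-byte magic value, so spotting one takes a single comparison. The function name and the two sample buffers below are hypothetical; the sketch only checks for that one signature and says nothing about headerless setups such as cryptsetup's plain mode, which is what the submitter describes using.

```python
LUKS_MAGIC = b"LUKS\xba\xbe"  # first six bytes of a LUKS header


def looks_like_luks(first_bytes: bytes) -> bool:
    """Return True if the data starts with the LUKS magic bytes."""
    return first_bytes.startswith(LUKS_MAGIC)


# Hypothetical buffers: one with the magic prepended, one without any header.
with_header = LUKS_MAGIC + bytes(506)
headerless = bytes(range(256)) * 2

print(looks_like_luks(with_header))
print(looks_like_luks(headerless))
```

A real scanner would know many more signatures than this one, but the idea is the same: any fixed header gives the container away regardless of how strong the cipher is.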
Does the person to whom you give these two files have a rubber hose? Is he a member of the “extraordinary rendition” team?
The point of steganography is to not get caught in the first place. If you need plausible deniability, you’ve already lost.
Cheers,
b&
All but God can prove this sentence true.
Weird. I guess there's a bug in my ROT13 implementation. If I run my text through twice, I just get the original message.
The CB App. What's your 20?
AES is designed to be a pseudo-random function (meaning it's evaluated against that criterion). What this means is that /when used properly/ AES encrypted data should be indistinguishable from random data, at least for a distinguisher running in bounded time. If anyone discovers an efficient algorithm that can distinguish this, it'll be a big nail in AES's coffin (and yes, at the very theoretical level I realize that there already are some known weaknesses in AES, but for the moment you're in good shape).
I think you're missing the point. Of course after they know that you have some encrypted data on your disk the strength of the encryption becomes moot because they can just drug / beat you until you tell them the key, but what this question is about is hiding encrypted data in unencrypted data so prying eyes can't tell if anything is even there at all.
For example, there may come a day when airport security could demand you disclose your passwords when they find you are carrying storage with encrypted content using the aforementioned techniques, but they aren't going to drug / beat every single person coming onto an airplane or going across a border. If your jpgs look like everybody else's jpgs both visually and under close analytical scrutiny they aren't going to bother you. Another example is there may come a day when any traffic on the Internet that cannot be positively identified as a common protocol with statistically "normal" contents is simply rejected. Maybe not here, maybe not right now, but this kind of idea is still very useful.
It depends what you call an 'encryption algorithm'. If you mean 'DES', then no - DES is nowadays considered a weaker algorithm. If you mean 'AES-256', then still no - you need to *apply* AES-256 before it's any good, because AES is a block-cipher and will re-encrypt identical blocks of plain-text with the same key to identical blocks of ciphertext. If you mean 'AES-256 in CBC mode with random IV and SHA-256 HMAC authentication', then that's an algorithm that can be safely used. Under certain real-world circumstances.
Religion is what happens when nature strikes and groupthink goes wrong.
Hard to say from your question, but if you haven't done so already, get yourself some crypto knowledge. Crypto is hard; there is a reason that you are laughed out of the room if you say you've invented a new crypto algorithm and you don't already have strong credentials.
Randomness is one of the harder computer problems. Especially in steganography, many implementations have been defeated by creating not enough or too much randomness. If you want to hide your message in something, it doesn't matter if your output is distinguishable from randomness, it matters if it is distinguishable from what should be there. Simple approaches like LSB tricks have often fallen because those happen to be not random in many input data.
Assorted stuff I do sometimes: Lemuria.org
Weird. I guess there's a bug in my ROT13 implementation. If I run my text through twice, I just get the original message.
Just do what they did with DES... use 3rot13 and you're much more secure than the original implementation.
Karma: SELECT `karma` FROM `users` WHERE `userid`=138474;
If you use AES in ECB mode, then the answer is that it's usually painfully obvious that the original data was structured.
If you do use chaining (CBC, or something similar), then it will look quite random.
Excellent example here: http://en.wikipedia.org/wiki/Block_cipher_modes_of_operation#Electronic_codebook_.28ECB.29
Am I part of the core demographic for Swedish Fish?
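The ECB versus chaining difference above can be seen without any crypto library. The sketch below uses a toy 4-round Feistel permutation built from HMAC-SHA256 as a stand-in block cipher; it is NOT AES and not meant to be secure, and every name in it is made up for illustration. Only the mode logic (independent blocks versus CBC chaining) is the point.

```python
import hashlib
import hmac
import os

BLOCK = 16  # block size in bytes, matching AES


def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Round function: HMAC-SHA256 keyed per round, truncated to half a block.
    return hmac.new(key + bytes([i]), half, hashlib.sha256).digest()[: BLOCK // 2]


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel permutation; a stand-in for AES, NOT real AES."""
    left, right = block[: BLOCK // 2], block[BLOCK // 2 :]
    for i in range(4):
        left, right = right, bytes(a ^ b for a, b in zip(left, _round(key, i, right)))
    return left + right


def ecb_encrypt(key: bytes, data: bytes) -> bytes:
    # ECB: every block is encrypted independently, so equal inputs give equal outputs.
    return b"".join(
        toy_encrypt_block(key, data[i : i + BLOCK]) for i in range(0, len(data), BLOCK)
    )


def cbc_encrypt(key: bytes, iv: bytes, data: bytes) -> bytes:
    # CBC: each plaintext block is XORed with the previous ciphertext block first.
    out, prev = [], iv
    for i in range(0, len(data), BLOCK):
        mixed = bytes(a ^ b for a, b in zip(data[i : i + BLOCK], prev))
        prev = toy_encrypt_block(key, mixed)
        out.append(prev)
    return b"".join(out)


def distinct_blocks(data: bytes) -> int:
    # Count how many different 16-byte blocks appear in the data.
    return len({data[i : i + BLOCK] for i in range(0, len(data), BLOCK)})


key = os.urandom(16)
plaintext = bytes(BLOCK) * 64  # 64 identical all-zero blocks (no padding needed)
ecb = ecb_encrypt(key, plaintext)
cbc = cbc_encrypt(key, os.urandom(BLOCK), plaintext)
print("distinct ECB blocks:", distinct_blocks(ecb))
print("distinct CBC blocks:", distinct_blocks(cbc))
```

With 64 identical plaintext blocks, the ECB output collapses to a single repeated ciphertext block, while the CBC output should contain 64 different blocks. That repetition is what makes structured data visible through ECB.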
Try to get your head around the idea that they might have possession of your hard disk but not have possession of you. Or they don't even know who you are. Or they are honest cops, trying to determine if you have violated the rules. They've asked you if there is encrypted data on the laptop, you said no, and they are doing a routine check to verify that. Contrary to popular opinion, "The Man" is not always ready, willing, and able to administer a beating.
Then there is the possibility that your opponent is not "the Man" but some sort of furtive criminal...
Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.
"Did I miss the point or do we need the drugs and wrench?"
You missed the point. The primary question of the OP is this: "...is it possible to prove there is encrypted data where you claim there's not?"
Hint: Include the likelihood of false-positives and false-negatives in your "wrench-based" analysis.
We know where leadership by an anti-intellectual "strongman" who scapegoats minorities and likes boisterous rallies goes
Looks Welsh...
Steganographic attempts are considered foiled if someone can detect that there is a secret message, they don't need to be able to retrieve the message in order for the attempt to be considered a failure. I did my Master's project on hiding data in the least significant bitplane of imagery. The trick is to "randomly" scatter your secret message throughout this plane. I showed methods that would allow you to do this so that the data was indistinguishable. You should always encrypt your secret message first so that it looks random, or better yet, shape the statistics of your encoded message to match the noise characteristics that were in the original LSB plane. If you use an image created from a very noisy source, such as a digital camera, and you encrypt the embedded message and scatter it using a reversible algorithm, and iteratively ensure that the statistics of the altered LSB plane look the same as the original LSB plane, I proved that it is not possible for someone to tell that there is a secret message hidden there. However, you need to be careful to use an original image you created yourself, and to destroy the original, because if someone ever compared the original to the one with the embedded message, they could definitely tell there was something altered by comparing the LSB planes.
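A minimal sketch of the scatter-and-embed step described above, with several loud assumptions: the cover is just a byte string standing in for raw pixel data, the "encrypted message" is random bytes, and the scatter uses Python's `random.Random`, which is not cryptographically secure. It also skips the statistics-matching step entirely, so it is not the method from the Master's project, only the basic LSB mechanics.

```python
import os
import random


def _positions(seed: int, cover_len: int, n_bits: int) -> list[int]:
    # Keyed pseudo-random scatter. random.Random is NOT cryptographically
    # secure; it only illustrates a reversible, seed-driven ordering.
    return random.Random(seed).sample(range(cover_len), n_bits)


def embed(cover: bytes, message: bytes, seed: int) -> bytes:
    """Write message bits into the least significant bits of scattered cover bytes."""
    n_bits = len(message) * 8
    if n_bits > len(cover):
        raise ValueError("cover too small for message")
    stego = bytearray(cover)
    for k, pos in enumerate(_positions(seed, len(cover), n_bits)):
        bit = (message[k // 8] >> (7 - k % 8)) & 1
        stego[pos] = (stego[pos] & 0xFE) | bit
    return bytes(stego)


def extract(stego: bytes, n_bytes: int, seed: int) -> bytes:
    """Read n_bytes back from the same scattered positions."""
    out = bytearray(n_bytes)
    for k, pos in enumerate(_positions(seed, len(stego), n_bytes * 8)):
        out[k // 8] |= (stego[pos] & 1) << (7 - k % 8)
    return bytes(out)


cover = os.urandom(4096)  # hypothetical stand-in for raw pixel bytes
secret = os.urandom(32)   # stand-in for an already-encrypted message
stego = embed(cover, secret, seed=1234)
recovered = extract(stego, len(secret), seed=1234)
max_change = max(abs(a - b) for a, b in zip(cover, stego))
print(recovered == secret, max_change)
```

No cover byte changes by more than 1, which is why the result is visually identical. Comparing `cover` and `stego` directly would still expose the altered positions, which is the reason given above for destroying the original.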
All the investigators need to do is run some fake but seemingly complex program that looks at the file under inspection and says "yes, steganography in use". Then the full weight of the law comes down, because now the suspect has to prove the negative - impossible of course.
So actually what is needed is a suspect's right that investigators prove any assertion that files have been hidden if that assertion/analysis is used as evidence in court.
If you find a file on my hard drive with data you can't readily decode, is it:
A) Compressed with an unknown compressor
B) Encrypted with an unknown encryptor
C) Random bytes used for an encryption process
D) Random bytes used for something else
I can't prove that answer D is wrong... but I don't have to because I know that 99% of the time, it's one of the other answers.
If you want to hide your data, the file must ostensibly have some other purpose... something that isn't obviously a lie. That's what steganography is about. For example, you might download as much of the 1 meter-resolution Google Maps satellite image as fits on your hard disk, save it uncompressed and then store encrypted data in the low-order bit of each byte (3 bytes to a pixel). Coupled with a map application that can display the imagery, it would appear to be one thing (a map) while really being another (a container for encrypted information).
At that point, unless you capture the encryption software it becomes hard to suspect that there is encrypted data, let alone prove it.
Moderating "-1, Disagree" is simple censorship. Have the guts to post your opinion.
I don't work for any 3-letter agency and even I could easily get the information needed.
With the right tools.
Mit der Dummheit kämpfen Götter selbst vergebens
they aren't going to drug / beat every single person coming onto an airplane
If you fly US Airways, there's a $25 service charge if you want to get beaten and drugged before boarding. I remember when that shit used to be included in the base ticket price.
Rubberhose (Pronounced Marutukku) is transparently deniable encryption, developed by (among others) Julian Assange.
This seems to do exactly what you're trying to do, so even if you want to go ahead and implement it yourself from scratch, it's worth reading up on what they've done to get some ideas and avoid some potential pitfalls.
Specialist Mac support for creative pros, Melbourne
If your jpgs look like everybody else's jpgs both visually and under close analytical scrutiny they aren't going to bother you.
I've developed a fascinating algorithm for encoding hidden data by slightly modulating breast sizes, but this comment is too small to contain it.
I don't care if it's 90,000 hectares. That lake was not my doing.
Not true. You are subject to the jurisdiction of the nation of registry of your craft.
DES is nowadays considered a weaker algorithm
DES is considered too weak for many uses due to its small key size.
Nonetheless, if you can find a way to reliably distinguish DES output from random bits, without knowledge of the key and with remotely-practical efficiency, you can publish a paper that will gain you substantial name recognition among the world's cryptographic elite.
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
There is no question it is computationally difficult, just not computationally impossible. The reason for that lies in computational complexity theory. You can get a basic summary of the theory here.
In summary, if the data being encrypted is compressible by practically any algorithm whatsoever, it has computational complexity less than its bit length, i.e. a smaller bit string can be used to recover the larger one. Likewise, the computational complexity of any encryption key is at most the length of the key in bits.
Suppose you are encrypting a 256 Kbit string that can be compressed by a factor of two by an _ideal_ compressor, and you have a maximally random 1 Kbit key. The maximum Kolmogorov complexity of any finite deterministic function of those two inputs that is known to the attacker is 129 Kbits (128 Kbits for the compressed data plus 1 Kbit for the key), whereas the maximum complexity of a truly random string of the same length as the input is 256 Kbits.
The difference between those figures opens the door to a statistical attack, because the data is not _really_ random. It just looks that way, sort of like the output of a pseudo random number generator, which isn't really random at all. If you encrypt a string of zeroes with a significantly shorter key, the output of a pseudo random number generator is exactly what you will get, a pattern that is maximally vulnerable to attack.
The lesson to be learned here is if you want to minimize the risk of attacks from folks with far more computer power than you have available, compress it using the best available compression algorithm first. Then the computational complexity of the input string will approach the theoretical maximum for a string of that length, depending on how good the compressor is. Throughout I mean complexity relative to what the attacker knows (such as the encryption algorithm) of course.
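Kolmogorov complexity itself is not computable, but the output size of a real compressor gives an upper bound on it, which is enough to illustrate the gap described above. A rough sketch, assuming zlib as the compressor and a made-up structured record format:

```python
import os
import zlib

# Hypothetical highly structured input versus random bytes of the same length.
structured = b"account=0000000042;balance=000000100.00;\n" * 2000
random_data = os.urandom(len(structured))

structured_packed = zlib.compress(structured, 9)
random_packed = zlib.compress(random_data, 9)

# Compressed size divided by original size; lower means more redundancy found.
structured_ratio = len(structured_packed) / len(structured)
random_ratio = len(random_packed) / len(random_data)
print(f"structured: {structured_ratio:.3f}")
print(f"random:     {random_ratio:.3f}")
```

The structured input shrinks to a small fraction of its size, while the random input does not shrink at all. Compressing before encrypting removes that redundancy from the plaintext, which is the effect the advice above is after; zlib is far from an ideal compressor, so treat the ratios as loose bounds only.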
If you find a file on my hard drive with data you can't readily decode, is it:
A) Compressed with an unknown compressor
B) Encrypted with an unknown encryptor
C) Random bytes used for an encryption process
D) Random bytes used for something else
I can't prove that answer D is wrong... but I don't have to because I know that 99% of the time, it's one of the other answers.....
OK, let's, as a community, add an (E). Everyone create a file on your laptop, in your home directory, named random.bin, as follows:
dd if=/dev/urandom of=random.bin bs=4096 count=10000
The actual value of the count isn't important, as long as it is large enough to create lots of random bits. If lots of people do this, we have “(E) Random bytes because Slashdot told me to”, providing plausible deniability for anyone who needs to use that file to encrypt something important.
But a good defense attorney would apply the same principle to show that the prosecution's legal submissions were really steganography hiding insults to the judge's mother.
Socialism: a lie told by totalitarians and believed by fools.
You tell them you just visited your cousin Jim, who had an old hard drive he didn't want anymore, and you needed a spare so he gave it to you, but not before he ran "dd if=/dev/urandom of=/dev/sda1" because he didn't want you having his old tax documents.
And now you have just fallen victim to a classic interrogation technique. They have just gotten you to tell a story that they can then investigate to determine its credibility. They will talk to your cousin Jim; they will look for signs of an OS installation at the date and time you said. They then ask more follow-up questions (for which they already know the true answer) to get you to dig a bigger grave for yourself. Then they show you that they know you are lying and inform you of the penalty for that crime and offer you a "deal" to tell the truth.
The fact is that when you are dealing with good interrogators, you cannot lie your way out of it. If you have a huge file full of random data, that is suspicious and there is nothing you can say to change that. The whole point of steganography is to hide the data in something innocent so that no one ever asks you anything. The goal is to blend in and give them no reason to give you a second thought.
Dammit, I finally get cthulu back to sleep and some jackass wakes him up again.
I read TFA and all I got was this lousy cookie
It's super easy to make up a key. XOR = key.